arXiv:1505.01983v2 [math.NA] 10Jun2015 


The Schwarzian-Newton method for solving nonlinear 
eqnations, with applications 

Javier Segura * 

Departamento de Matematicas, Estadistica y Computacion, 

Univ. de Cantabria, 39005 Santander, Spain, 
e-mail: javier.segnra@unican.es 


Abstract 

The Schwarzian-Newton method can be defined as the minimal method for solving 
nonlinear equations f{x) = 0 which is exact for any function / with constant Schwarzian 
derivative; exactness means that the method gives the exact root in one iteration for 
any starting value in a neighborhood of the root. This is a fourth order method which 
has Halley’s method as limit when the Schwarzian derivative tends to zero. We obtain 
conditions for the convergence of the SNM in an interval and show how this method 
can be applied for a reliable and fast solution of some problems, like the inversion of 
cumulative distribution functions (gamma and beta distributions) and the inversion of 
elliptic integrals. 
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1 Introduction 

The problem of inverting functions is one of the central problems in numerical analysis, with 
uncountable applications. There exists a vast literature on numerical methods for solving 
nonlinear equations and many different methods with different properties are available; partic¬ 
ularly, there is a considerable interest in building high order methods with the highest possible 
efficiency index m- 

The study of non-local convergence properties is, however, not so common. But without a 
knowledge of these properties, the use of fast numerical methods will require accurate initial 
estimations in order to guarantee convergence. There is, in addition, the general rule that 
the higher the convergence rate, the more unpredictable the method will be when accurate 
initial approximations are not available. This is probably one of the reasons why high order 
methods are not so common in applications; bisection, secant and Newton’s methods are the 
most popular followed closely by Halley’s method (order 3) and other third order variants. 

*The author acknowledges financial support from Ministerio de Economfa y Competitividad (project 
MTM2012-34787) 
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The ideal situation would be that of a high order method with good and manageable 
non-local convergence properties; however, there is little (if any) information on the non-local 
behaviour of methods of order larger than three, particularly concerning verifiable conditions 
for convergence in an interval. In this article, we propose the Schwarzian-Newton (SNM) which 
can dehned as the minimal method for solving nonlinear equations f{x) = 0 which is exact for 
any function / with constant Schwarzian derivative, meaning that the method gives the exact 
root in one iteration for any starting value in a neighborhood of the root. This is a fourth 
order method which has Halley’s method (HM) as limit when the Schwarzian derivative tends 
to zero. As happens for the HM [smi], the SNM has a simple geometrical interpretation in 
terms of osculating curves. 

The SNM will be proven to converge (monotonically) in less iterations that Halley’s method 
(HM) for functions with negative and monotonic Schwarzian derivative (decreasing in the direc¬ 
tion of the converging sequence). For positive Schwarzian derivative, the SNM also converges 
under similar monotonicity conditions, while the HM may not converge. We obtain conditions 
for the convergence of the SNM in an interval and show how this method can be applied for 
a reliable and fast solution of some problems, like the inversion of cumulative distribution 
functions (gamma and beta distributions) and the inversion of elliptic integrals. 


2 The Schwarzian-Newton method: definition, geometric 
interpretation and convergence properties 

Let / be sufficiently differentiable. Our aim is to solve f{x) = 0. We write 

fix) + B{x)f{x) = 0, Bix) = -fix)/fix), (1) 


which is a second order ODE that we can transform to normal form by setting 


^ = exp ( i [ B] f = 


f 


V\f\ 


This leads to 

with 


- 1 - = 0 


n = --B^ - -B' 
4 2 


2 \f' 





( 2 ) 

(3) 

(4) 


where {f,x} is the Schwarzian derivative of / with respect to x. 

As is well know, the application of Newton’s method to the function $ leads to the HM [3]: 


Xn+l = g{Xn) = Xn 


^{Xn) _ 
^'(Xn) 


f'{Xn) 


f{Xn) 


2/'(a;„) 


f{Xn) 


(5) 


This is a third order method. This is easy to check by considering that if $(a) = 0 then 
$"(a) = 0 (provided H is bounded at a), and then g'{a) = g''{a) = 0. 
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An improvement in the order of convergence can be obtained by integrating approximately 
the differential equation dS]) (or the associated Riccati equation), similarly as in [12]. This gives 
a fourth order fixed point method. 

We start from the Riccati equation satisfied h'{x) = 1 + VL{x)h{x)'^, h = $/$'. Now let a 
such that h{a) = 0 (and then f{a) = 0 ) and assume for the moment that il{x) > 0 , we have: 

I _ 

x — a= / -^- dtK, — a,rcta,n(-v/0(.T)h(.7:)). (6) 

l + r!(t)h2(i) K ) y n 


where the approximation consists in taking Q.{x) constant in the integration. From this, we 
obtain an approximation for a. We can iterate these approximations and obtain the fixed point 
method Xn+i = g(xn) with 


where 


qix) = X -;= arctan ( VQ —7 | , 

Vn \ ^ J 


$ 

i' 


/ 


/ 


B f" ■ 

2 /+/' r-y 


(7) 


In other words, 
passing through 
For the case 
can write: 


the method consists in computing the solutions of h'{x) = 1 + n(x„)/i(a;)^ 
the point {Xm h(xn)) and then computing Xn+i from h{xn+i) = 0 . 
n < 0, and using arctan(-\/A)/\/A = arctanh('v/—A )/\/—A when A < 0, we 


q(x) = X -;^^arctanh ( \/|n |—7 ) . 

\^l V ^ J 


( 8 ) 


The iteration functions o and dH) define the SNM, Xn+i = g{xn)] we observe that the 
SNM has the HM as limiting case for {/, a;} —>■ 0. 


Remark 1 The SNM is exact for functions with constant Schwarzian derivative because the 
approximate integration in m becomes exact in this case. In this sense, it is the analogous to 
the standard Newton method, which is exact for functions with constant ordinary derivative; 
this is why we call it Schwarzian-Newton method. 


A straightforward computation shows that the SNM satisfies: g'[a) = g"{a) = g'"{a) = 0 
while g^'^'^ia) = 211.'{a) (where /(a) = 0 ); this implies, denoting the errors by Ck = Xk — a: 

_ fl (a) 4 \ 

Cn+l — ^2 

Therefore, this is a fourth order method involving third order derivatives. The same would be 
true, for example, for a fourth order Newton-like method (for example of the type considered 
in H); the essential difference will be in the good non-local convergence properties of the SNM 
(see Theorem [Hand Corollary [T]). 

The SNM is related to the method described in [T^, which is a method for computing 
zeros of functions which are solution of second order differential equations. Differently, the 
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SNM can be applied to the inversion of any function provided the Schwarzian derivative can 
be computed; therefore, it requires that the first three derivatives of the function are available. 

As we mentioned, the SNM can be interpreted as an improvement of the HM. Next we 
study in more detail the connection of the SNM with the HM, particularly with respect to 
their geometrical interpretation. 


2.1 Geometrical interpretation of the Halley and Schwarzian-Newton 
methods 


For brevity, let us denote 


tan(A,a;) = —=tan{VXx) = 
V A 


—^ tan(-\/Ax) 
V A 


A > 0, 
A = 0, 


} tanh(-\/—Ax) , A < 0, 
V-A 


(9) 


and similarly for the inverse function arctan(A,x) = arctan(-\/Ax)/-\/A. In terms of this, the 
SNM reads 


/ 


g{x) = X — arctan 


t f 




V 


( 10 ) 


/ 2 /'-^/ 


It is easy to check that the more general functions for which the Schwarzian derivative is 
constant are 

tan(A, x) + A 


h{x) = 


( 11 ) 


i3tan(A,x) + C" 

with {h,x} = 2A. This can be checked by direct integration of the differential equation {h, cc} = 
2A, A constant. 

In particular, the most general function with zero Schwarzian derivative derivative is 

X -\- A 


h{x) = 


Bx + C 


( 12 ) 


which is the set of functions for which the HM is exact. This is consistent with the fact that 
the SNM has the HM as limiting case when {f,x} —>■ 0. 

Precisely because the HM is exact for functions of the type (USD, a way to obtain the HM for 
computing the roots of f{x) = 0 is by considering an osculating curve. We have the following 
well-known result: 


Theorem 1 Let h(x) as in ilS\) and define y{x) = h{x — Xn), the HM ^ is obtained by setting 
y{xn) = f{xn), y'{xn) = f'{x„), y"{xn) = f''{xn) and 2 /"'(x„) = f"'{xn) (thus determining 
the three constants) and obtaining Xn+i from y(xn+i) = 0. The three constants are given by 


H = ‘^fMf'jxn) ^ _ f"ixn} ^ _ 2f (x„) 
D{Xn) ’ D{Xn) ’ D{Xn) 

where 

D{Xn) = 2f'{XnY - f{Xn)f"{Xn) 


(13) 

(14) 
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Because of this result, the HM is also known as the method of tangent hyperbolas m- 

Similarly, because the most general function for which the SNM is exact is (HU, an alter¬ 
native way of construction of the method in terms of osculating curves becomes available, as 
the next theorem shows. 

Theorem 2 Let h{x) as in f77]) and define y{x) = h{x — Xn), the SNM Hdi) is obtained 
by setting y{xn) = fix„), y'{xn) = f'{xn), y"{xn) = /"(a;„) and y'''{xn) = f"'{xn) (thus 
determining the four constants) and obtaining Xn+i from y{xn+i) = 0. The constant A is 
given by 

X = n{x„),n{x) = ^{f,x} ( 15 ) 

and the other three constants A, B and C are as in Theorem\J\ 

Proof. The four conditions on y{x) give 

f{Xn) = f'{Xn) = ^ 

The method is obtained by setting y(xn+i) = 0, which gives 

Xn+i = g{xn) = Xn - arctan(A, A). 

and this function g{x) will be the same as (jTU)) if 

}_ _ f'jXn) _ f'jXn) 

A f{Xn) 2/'(x„)’ 

and 

l\ r{Xn) ^( f"{XnW \ 

2 \/'(Xn) ^Kf'Mj /’ 

which is immediate to check using (HU). 

And using these last two relations together with (fTOl) we readily obtain the coefficients. □ 

2.2 Non-local convergence properties of the Schwarzian-Newton method 

Before analyzing the non-local convergence properties of the SNM, we recall a result of con¬ 
vergence in an interval for the HM, which is limiting case of the SNM. 

Theorem 3 Let f with f' 0 and f" continuous in an interval J and let a € J such that 
f{a) = 0. Then, if {f,x} < 0 in J the HM converges monotonically to a for any starting value 
xq G j. 

Proof. Let us consider that /' > 0 (if /' > 0 we can proceed with the substitution f ^ —f). As 
discussed before, the HM is equivalent to the application of Newton’s method to the function 
^ = f / y/T- We notice that 
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and because we are assuming that {f,x} < 0 then $(x)$"(a;) > 0 for all a; S J\ {a}. 

On the other hand, because /' > 0 then $' > 0. Indeed, /' > 0 implies that < 0 if 
X < a and $(a:) > 0 if cc > a (and the same for $"(a;) > 0 because $(a;)$"(a:) > 0 for all 
X G J \ {a})- Then $'(a) > 0 and because $"(a;) > 0 for x > a and < 0 for x < a we 

have $'(x) > 0 for all x G J. 

Therefore, $ is strictly increasing in J and $(x)<I>"(x) > 0 for all x G J \ {a}, which 
guarantees monotonic convergence of the Newton method to the zero a of the function $. □ 

For an alternative formulation and proof of the Theorem [3] see [5] . Similar results and 
geometrical proofs for other related third order methods can be found in [ 1 ]. 

From the proof of Theorem [Sj it is clear that in the case of positive Schwarzian derivative 
it is not possible to guarantee convergence in an interval because the convexity properties of 
$ are opposite to those which guarantee convergence of the Newton method. Contrarily, we 
will see that conditions of non-local monotonic convergence can be found for the SNM both 
for positive and negative Schwarzian derivative, but we will need an additional monotonicity 
condition. 

Remark 2 From now on, we will consider that the hypotheses of theorem\^ (f ^ 0 and f" 
continuous in an interval J and with a G J such that f{a = 0)) hold in the subsequent results. 

Before proving the convergence theorems for the SNM it is important to analyze the be¬ 
havior of the function h(x) = $(x)/$'(x) depending on the sign and monotonicity properties 
of the Schwarzian derivative. 

Lemma 1 Under the hypotheses of Remark\^ the following holds: 

{fy^} > 0 h{x) is strictly increasing when it is defined and it may have one or two 
singularities (one smaller and one larger than a). 

2. If {f,x} < 0, h(x) is strictly increasing if \h(x)\ < |n(x)|“^^^, n(x) = ^{/,x} and 
strictly decreasing if |h(x)| > |f 2 (x)|“^/^ (when it is defined). h{x) as at most one zero 
and at most one singularity. 

3. Let {/,x} < 0 and decreasing (increasing) in J. If X- is such that h'{x-) < 0 then 
h'(x) < 0 and h{x) 0 for all x G J greater (smaller) than x_. 

The proof of Lemma [T] follows easily from graphical arguments. We give the proof in the 
Appendix. 

Using the these properties of the function we can now prove results for the convergence of 
the SNM in intervals. 

Theorem 4 Under the hypotheses of Remark\^ the following holds: 

1. Let {/, x} be decreasing in I = [a, a] C J. //{/, x} < 0 in J then the SNM converges 
monotonically to a for any starting value xq G [a, a]. // {/, x} > 0 in part of the interval, 
the same is true if, in addition, the SNM iteration satisfies g{a) > a. 

2. Let {/, x} be increasing in I = [a,b] C J, If {/, x} < 0 in J then the SNM converges 
monotonically to a for any starting value xq G [a, b]. If {/, x} > 0 m part of the interval, 
the same is true if, in adddition, the SNM iteration satisfies g{b) < b. 
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Proof. We prove the case l.a. The case l.b is proved in a similar way. 

First we have to prove that the SNM is defined for all x G [a,a]. For this, we need 
\il.{x)h{x)‘^\ < 1 in [a, a] (so that the arctanh function in Eq. (jS]) is defined); but this is 
necessarily so, because if there existed a X- such that |/i(a;_)| > then h'{x-) < 0 , 

and because {/,x} < 0, Lemma [T] guarantees that h{x) 0 for all x > x_, in contradiction 
with the fact that h{a) = 0 , x_ < a. 

The monotonic convergence follows from the fact that the method consists in computing 
the solution of 

h'(x) = 1 + rt{xn)h{x)^ (17) 

passing through the point (x„,/i(x„)), where /i(x) = $(x)/$'(x) satisfies 

h'(x) = 1 + n(x)/i(x)^, (18) 

and then computing the next iteration x^+i from h(xri+i) = 0. Now, given Xn G [a, a) and 
because n(x) < fl{xn) for x„ < x < a, the graph of the solution of (fT71) lies above the graph 
of h{x), which is solution of (IT^ : as a consequence the function h crosses the x-axis before 
h{x) does. Therefore x„+i < a, but also x„ < x„+i because h{xn) < 0 and this implies 
that Xn+i = g{xn) > Xn- We have a monotonically increasing sequence bounded by a and 
converging to a which is the only fixed point of g(x) in J. 

For n(x) > 0 the same proof is valid except for the fact that we must guarantee that h[x) 
is defined for all x in (o,a), for which we need that no value Xoo G {a, a) exist such that 
d’^(xoo) = 0. But because h{x) is monotonically increasing for f2(x) and there are no zeros 
of h{x) in (a, a), if such value Xoo existed h{x) — $(x)/<i)'(x) would change sign at Xoo an 
therefore h{a) > 0; in that case g{a) < a. Therefore, if g{a) > a, h{x) is defined for all x in 
(a, a) (and also in x = a because we are assuming that g{a) is defined). □ 

As corollary of the previous theorem we have: 

Corollary 1 Under the hypotheses of Remark\^ '^f {fix} has one and only one extremum at 
Xe G J and it is a maximum, then 

1. //{/, x} is negative the SNM converges monotonically to a starting from xg = Xe 

2. If (xe — a){xe — g{xe)) > 0 the SNM converges monotonically to a starting from xg = Xg- 

Observe that the the second result does not depend on the sign of {/, x}. 

We end this section with a comparison of the convergence properties of the HM and the 
SNM. We wrote the SNM x„+i = g(x„) as g(x) = x — arctan(0(x), /i(x)); the HM corresponds 
to g(x) = X — arctan(0, h{x)). And because arctan(A, 1) is decreasing as a function of A > —1, 
we have, when g{x) is real, that 

I arctan(H(x),/i(x))| > | arctan( 0 ,/i(x))| if H(x) < 0 


and 

I arctan(H(x),/i(x))| < | arctan(0,/i(x))| if H(x) > 0 

As a consequence 


7 


Theorem 5 The steps of the SNM (xn+i — Xn) are of the same sign and greater (smaller) in 
absolute value than those for HM when {f,x} is negative (positive). 

For the case of negative {f,x} this means that, when Theorem |4] and Corollary [1] hold, the 
HM also converges monotonically. This is as expected on account of Theorem [31 And even 
more interestingly: 

Corollary 2 If {f,x} < 0 in J and {f,x} is decreasing in I = [a, a] G J (or increasing 
in I = [a, 6 ] C J), f{a) = 0, the SNM converges monotonically to a (within a prescribed 
accuracy) in less iterations than the HM for any xq € I. 

Contrarily, for the case of {/, a:} > 0 the steps of the SNM are smaller in absolute value 
than those of the HM. In this case, there are no convergence results for the HM in an interval; 
the HM may not converge. For instance, considering the trivial case of computing the root 
of f{x) = tan(x) in (—7r/2,7r/2) the SNM is exact because {/, a;} is constant (H(a;) = 1 and 
h{x) = tan(a;) and g{x) = x — arctan(tan(a:)) = 0,) while the HM is given by g{x) = x — tan(a:) 
and gives values outside {—ttI2,tt(2) if xq is close to ±7r/2. 


3 Applications 

3.1 Unimodal cumulative distribution functions 

We consider two examples of functions defined as 

f{x) = / p{x),x e [a, 6 ], 

J a 

with p{x) > 0 and normalized in such a way that f{b) = 1. In this case f{x) is a cumulative 
distribution function with probability density function p{x). We call this unimodal if p{x) has 
only one relative extremum in [a, b] and it is a maximum. This functions f{x) have a sigmoidal 
shape with an inflection point at the maximum of p(x). 

Let us recall that in the set of functions with constant negative Schwarzian derivative, we 
find functions with a sigmoidal behavior, as is the case of tanh(a;); and as we discussed earlier, 
the SNM can be interpreted in terms of osculating curves with sigmoidal form for approximating 
the inversion problem. This seems adequate for these type of cumulative distributions and 
provides a first hint regarding why the method can be particularly useful in these cases. 

We provide some examples of application of the method for cumulative distributions such 
as the central gamma and beta distributions. As we will see, the indications that the method 
could have good global convergence properties will be confirmed. We start with the case of the 
central gamma distribution. 


3.1.1 The central gamma distribution 


As in [To] we denote 


P(a, x) 


T{a) 


, Q{a,x) 
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T{a) 


(19) 




where 


nX /* + 00 

7 (a,x) = / T{a,x) = / 

Jo Jx 

P is the lower tail gamma distribution and Q is the upper tail; we have P + Q = 1, P G [0,1], 
Q G [0,1]. P is increasing, Q decreasing and they are both sigmoidal functions with inflection 
point at a; = o — 1. Close to this inflection point (particularly for large a) the values of P and 
Q are similar. 

The problem is either to invert P{a, x) — p = 0 or Q{a, x) — q = 0 {q = 1 — p); for obvious 
reasons, it is better numerically to invert the first equation when p < 1/2 and the second in 
the other case. This is particularly important when p (or q) is very small. We take as function 
problem f(x) = P{a, x) — p or f(x) = q — Q(a, x) (which have the same derivatives). In both 
cases we have 

f\x) + B{x)f'{x) = 0, B{x) = 1 + -—- 

X 

Considering the transformation ([5]) we arrive at 

$"(a:) + fl(x)$(x) = 0 , $(x) = ( 20 ) 

where 

= + + ( 21 ) 

We consider a > 0, starting with the case a > 1. The case a = 1 is trivial and exact for our 
method (fl constant). 

We observe that for a > 1 17 (ac) < 0 for all x > 0 and therefore we are in the case of negative 
Schwarzian derivative. Indeed 17(0+) < 0, 17(+oo) < 0 and 

^'(x) = -a) + a^- 1 ). 

Therefore the only relative extrema is at x^ = a + 1 and it is a maximum, where 17 (a;^) = 
-(4(l + a))-i < 0. 

Then, the fixed point method is ([5]) with 17(a;) given by (I^Tl) and 


<i)(x) 


1 

2 



1 — a 

X 



r(a) 


where f{x) = P{a, x) — p oi f(x) = q — Q{a, x). 

Convergence is monotonic starting from xq = Xm = a + 1 (see Corollary [T|) . 

It is instructive to compare the performance of the SNM with HM. We know, from Corollary 
m that the SNM will converge (within a prescribed accuracy) in no more iterations than HM, 
furthermore, it has larger order of convergence. In addition, the SNM is exact for a = 1 and 
therefore it is also extremely efficient for values of a close to 1. 

We also provide graphical evidence of the superiority of the SNM by plotting the osculating 
curves at a; = Xm (see Figure[T]); as the figure shows, the osculating curve for the SNM is much 
closer to the function P(a,x) than the corresponding curves for the Newton method and the 
HM. 
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Figure 1: Left: plot of the function P(a, x) for a = 30 (solid line), together with the osculating 
curve at a; = a + 1 for the SNM method (dashed line); the horizontal lines P = 0.05 and 
P = 0.95 are are also shown (dotted lines). Right: plot of P{a,x) for a = 30 (solid line) and 
the osculating curves at a; = a + 1 for the SNM (dashed), HM (circles) and Newton method 
(dotted). 


For 0 < o < 1 the relative extremum Xm = a + 1 of fl{x) is no longer a maximum but a 
minimum; n{x) is decreasing for x < Xm and increasing for x > Xm- We can maintain the 
previous iteration function but we can no longer use as starting value Xq = Xm] instead, we will 
have monotonic convergence starting from either large enough x or small enough x, depending 
on the value of p. Additionally, VL{x) becomes positive for small enough x and the resulting 
algorithm becomes slightly more complicated. 

A way round is to consider a change of variables so that the transformed function has the 
same properties as before: negative Schwarzian derivative and only one extremum (a maximum) 
at most. In particular, we can consider the changes 


J x^/m, m > 0 
( log(x), TO = 0, 


( 22 ) 


as done in [4]. In this new variable, the function f{z) = f{x{z)) {x{z) the inverse function of 

z{x)) is such that $(z) = f{z) satisfies $(z) + n(z)$(z) = 0 where dots mean derivative 

with respect to z and 

^(z) = - 2m ~ 2 (q - l)a: + - TO^), 

0(z) = ^xx~'^"^~^ {{m — Vjx"^ + (2to — I)(I — a)x + m{a? — m^)) 

(see [3], section 4). Two interesting cases are to = 1/2 and to = 0; the case to = 1 {z{x) = x) 
was considered before. 
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For m = 1/2, ^{z) is negative for all a; > 0 (and then z > 0) if a > 1/2, with a maximum 
at a: = — 1/4 {z = 2{a^ — 1/4)^/^^). The situation is then similar as before and we can 

perform the inversion with starting value corresponding to this maximum if a > 1 / 2 . 

For m = 0 {z{x) — logx), n(z) is negative for all a; > 0 (and z G (—oo,+oo)) if a > 0. 
The only extremum corresponds to a: = a — 1, that is Zg = log(a — 1). For a < 1, 0(z) is 
strictly monotonic in R (because Zg ^ R) and it is decreasing. Therefore, considering a value 
of z < log a (/(a) = 0), convergence is guaranteed. For a > 1, V,{z) is negative and with a 
maximum at z = log(a— 1), and then the convergence properties are similar to the case to = 1 . 

Therefore, combining the case to = 1 for o > 1 with to = 0 for a G (0,1) (to =1/2 
for a > 1/2 is also possible) we have a reliable and fast method of inversion of the gamma 
distribution. 

A recent algorithm for the inversion of the gamma distribution can be found in [ 8 ]. The 
approach was to compute sufficiently accurate initial approximations, depending on the range of 
parameters in order to ensure convergence for a high order Newton method. The SNM provides 
a simpler and efficient method of computation with guaranteed convergence, particularly for 
not to small p or q = 1 — p; for instance, when 0.1 < p < 0.9, three iterations are enough for 
20 digits accuracy starting with xq = a — 1. The use of the starting values considered in [5] 
can, of course, improve the performance, particularly for small p or q. 

The noncentral gamma distribution Qfj,{x, y) (which becomes the central distribution (5(/r, y) 
for a; = 0) can also be inverted using the SNM. In particular, in [ 8 ] we applied this method 
for inverting Qi/ 2 {x,y) = q, both with respect to x with y fixed and with respect to y with x 
fixed, where 

Qi/2{x, y)= 2 (®rfc(Vy + Vx) + erfc(^ - v^)) . 


3.1.2 The central beta distribution 


As another example of the inversion of cumulative distributions, we consider the cumulative 
beta distribution 


Ix{a,b) 


B{a, b) 


f ^(1 —t)^ ^dt, X € [0,1], B{a,b) 

Jo 


r(a)r(6) 

F(a + b) ’ 


(24) 


with complementary function Jx{a, b) = 1 — Ix{a, b). The problem is to invert f{x) = Ix{<i, b)—p 
(or f{x) =q - Jx{a, b),q = l- p). 

In this case we have 


B{x) 


a—1 I 6—1 

X 'X I - X' 


f2(x) 


(g - 1)(6- 1) _ 1 - 1 _ 1 6^- 1 

2 x(l-x) 5 5 (1 - xf 


(25) 


For the moment let us consider a > 1 and 6 > 1. Because 11(0+) = —oo, fl(l“) = —oo 
and n(x) is differentiable in (0,1), it has at least one extremum in (0,1). Now we check that 
when a and 6 are greater that one there is exactly one extremum and it is is a maximum; in 
addition, n(x) is negative in (0,1). This means that, similarly as in the case of the gamma 
distribution, convergence of the SNM can be guaranteed by choosing as starting value the value 
of X corresponding to this maximum. 
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That fi(x) is negative follows from the observation that the equation Q(x) = 0 does not 
have real roots if a > 1 and 6 > 1. This is easy to check by writing 

D = (a + /3)(a +/3 + 2), E = -(2a^ + 2a/3 + 4a), F = a(a + 2) 

where a = a — 1 and {3 = b — 1. The discriminant is A = — 4DF = —2af3{a + (3 + 2) < 0 

and therefore there are no real roots. 

For proving there is only one relative extremum, we compute Q'(x) 

fi'(x) = — x)~^Q(x), Q(x) = Gx^ + Hx'^ + Ix + J, 

G = (a + l3){a + (3 + 2), E[ = —3(q;^ + al3 + 2a), 

I = 3a^ + a/3 + 6a, J = —a(a + 2) 

Then, by Descartes rule of signs we see that Q{x) has either 1 or 3 real roots. But if it had 3 
roots then Q'(x) should have two real roots. But it is easy to check that the equation Q'{x) = 0 
does not have real roots when a > 0, /3 > 0. 

The real root Xm of Q{x) (which gives the abscissa of the maxima of D(a:)) can be computed 
using standard formulas for solving the cubic equation Q(xm) = 0. 

Same as for the gamma distribution, the changes of variable in [1] can be used to deal with 
other parameter cases. For instance, with the change z{x) = \og{x/{l — x)) we arrive to 

^{z) = ^(—(a + b){a + b— 2.)x{z)'^ + 2(a + b){a — l)x{z) — a^) (26) 

which is always negative for x S [0,1] (a and b are positive). For a < 1, & < 1 D( 2 ;(x)) has 
a minimum at Xm = (ci — l)/(a + & ~ 2), for a > 1 and b > 1 there is a maximum at Xm, 
while for the rest of cases the function is D is monotonic (for x € [0,1]) . The analysis of the 
monotonicity properties is more simple than in the previous case without change of variables. 
Furthermore, it appears that in some cases this alternative method is more effective [6]. 

3.2 The incomplete elliptical integral of the second kind 

In the previous two examples we considered functions with negative Schwarzian derivative, 
which is the case of simpler application of the SNM (see Theorem 2] and Corollary [IJ. But if it 
becomes positive, the SNM can be also applied if the monotonicity properties of the Schwarzian 
derivative are available. 

As an example of this, we consider the inversion of 

f(x)=E(sm(x),m)—pE(l,m)= f \/1 — vn? sin^ tdt — pE{l, m) (27) 

Jo 

with respect to x, where x G [0,7r/2], 0 < m < 1, p G [0,1]. The function is increasing in the 
interval [0,7r/2] and /(O)) = —pE{l,m) < 0, /(7r/2) = {l—p)E{l,m) > 0, therefore it has one 
and only one zero in this interval. 

The inversion of this function was recently considered in ElE]; we later discuss the advan¬ 
tages of our approach. 
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Proceeding as before we have 


i}{x) 


m? w? cos^ X + {w? — 4) cos^ x + 2(1 — w?) 
4 (1 — sin^ a:)^ 


(28) 


Differently to the previous cases, D(x) changes sign and V,{x) < 0 if a; < Xc{m) while 
fl{x) > 0 if a: > Xc{m) where 


Xc{ra) = arccos 


I /d— 

IV 2^^? 


This means we should use © when X > Xc{m) and (H)) when x < Xc{rn) (or the general iteration 
function (| 10 p l. As a function of m, Xcim) is increasing with a;c(0) = 7 r /4 and Xc{^) = 7 r/ 2 . 
Differentiating, 


D'(a:) 


sin X cos x 3to^ (2 — m^) cos^ x + — 4 

2 (1 - sin^ a:)^ 


(29) 


Therefore we have D'(a;) = 0 at x = 0, a: = 7 r /2 and at Xg if m > 2/-\/7 with 

2 — 4 7to^ — 4 

cos Xg = -;i- TT- = 1 - T-, -5T- 

3to^ — 6 to^ 3m^(2 — m?) 

It is easy to see that Xg < Xg and therefore D(xg) < 0 and D(x) reaches its minimum at x = Xg. 
In the other case, when m < 2/-\/7, D(x) is monotonically increasing in ( 0 , 7 r/ 2 ). 

Therefore, considering Theorem |4] for m < ^j^/l the SNM converges monotonically to the 
root, starting with xq = g{'n 12)^ where 


5^2) = I 


V^2(l - 111?) 
m 


arctan 


\ 72(1-m^) )' 


It turns out that 0 < g{'xl2) < -n 12 for m G (0,1), p G (0,1). 

If TO > 2 / 77 , we also have monotonic convergence to the root a starting from xq = g{-K/ 2 ) 
if a > Xg; contrarily, if a < Xg we have monotonic convergence starting from xq = 5(0), because 
D(x) would be decreasing between 0 and a. We have 

g(0) = —arctanh ( - ^ - 

m V 72 

For TO > 2/ s/tt the SNM will converge monotonically with one of the two starting values 
a^o = 5 ( 0 ) or Xq = 5 ( 71 / 2 ). It does not seem easy to determine a priori which of the two is 
the correct selection, except in some cases. For example if 5(0) > 5 ( 71 / 2 ), 5 ( 71 / 2 ) is the best 
option. Even when this fact is not determined a priori, one can build a very efficient algorithm 
using these starting values. 

The value 5(0) turns out to be a good approximation for not to large values of p and it 
is generally better than 5 ( 77 / 2 ). The reason for this is that D(x) has slower variation near 
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cc = 0 than near x = 7r/2. We have observed that considering g{Q) as starting value instead of 
(/(7r/2) is the best option when g{0) < g(7r/2) and p not too large (say, p < 0.8). This can be 
complemented with a simple approximation when m is very close to 1 (say m > 0.95) using 
that E{smx, 1) = sin a;). 

Using these approximations as starting values, we have checked that a relative accuracy 
close to 10“^® can be obtained with only two iterations, and that the accuracy is better than 
10“^° in two iterations if m is smaller than 0.8. This is clearly better than the 10“^° accuracy 
in three iterations of [5]. Notice, in addition, that our algorithm quadruplicates the number 
of exact digits in each iteration while the algorithm in [5] uses Newton-Raphson, which only 
duplicates it. In alternative methods are discussed which use accelerated bisection improved 
with the Halley method; our algorithm does not need acceleration because it is a fast high order 
method from the start (faster than Halley method). 

Similar ideas can be used for the inversion of other incomplete elliptic integrals. In partic¬ 
ular, the case of the incomplete elliptic integral of the first kind is very similar. 


Appendix: proof of Lemma [T] 

Proof. We know that h'{x) = 1-1- n{x)h{x)^ and the hrst two statements concerning the 
monotonicity are obvious. 

For the case fl{x) > 0 the function is always increasing when it is defined, and necessarily 
the zeros and singularities interlace. But because there is only one zero of h(x) in J there can 
be two singularities of h{x) at most. 

For the case ^l(x) < 0, h{x) is increasing in a region symmetric around the a;-axis (|h(a;)| < 
|fl(a;)|“^/^) and decreasing outside that region. Then, if it has a zero a it is increasing at the 
zero, and it is easy to check by graphical arguments that h(x) can not cross the a;-axis again for 
X > a or X < a. Similarly, proceeding with h = —$'(x)/$(a;) and because $"-I-H(a:)$(a;) = 0, 
we have h' = Q + h, which also increasing in a band around the a;-axis. Following the same 
argument as before, h as one zero at most. Therefore h{x) has one singularity at most. 

As for the third item, we consider the case of decreasing {/, x}] the second case is analogous. 
We prove that h{x) is decreasing when it is dehned for x > X- and therefore that h{x) ^ 0 {h{x) 
is necessarily increasing at its zeros, as the second statement of this theorem confirms). Because 
h'{x-) < 0 , there are two possibilities: either h{x-) > |H(x_)|“^/^ or h{x-) < —|H(a;_)|“^/^. 
We start with the first case: 

1. If h{x-) > |r2(a;_)|“^/^ = A(a;_), then h{x) is decreasing at a; = a;_ and will remain 
decreasing for x > X- with h{x) > X{x). The reason for this is that A(a;) is decreasing in J 
and then there can not exist a value Xc > X- such that h(xc) = \{xc) (and then h'{xc) = 0 ). 
Indeed, because h{x-) > X(x-), then h{x~) > X{x~) and therefore h'{xc) < X'{xc) < 0, 
contradicting the fact that h'{xc) = 0 . 

2. If h{x-) < — A(a:_) then, because — A(a:) is increasing, h{x) will remain negative and 
decreasing (|h(a;)| increasing) as long as it is continuous. It may happen that h{x) has a vertical 
asymptote at certain Xao > X- such that h{xf^) = —oo {xoo would be a zero of $'(a;)). In 
that case, if h{x) is dehned for x > Xoo we would have h{x^) = -l-oo and h{x) would remain 
positive and decreasing in the rest of the interval (we are in the case discussed before). □ 
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